SIMPLE METRICS FOR PROGRAMMING LANGUAGES



Bruce J. MacLennan 
Computer Science Department 
Naval Postgraduate School
Monterey, CA 93940



Abstract:

Several metrics for guiding the design and evaluation of
programming languages are introduced. The objective is to
formalize notions such as 'size', 'complexity', 'orthogonality',
and 'simplicity'. Three different kinds of metrics are described:
syntactic, semantic, and transformational.

Syntactic metrics are based on the size of a context-free 
grammar for a language or a part of a language. They can be used 
to judge the size of a language and the relative sizes of its 
parts. These techniques are demonstrated by their application to 
Pascal, Algol-60, and Ada. 

Syntactic metrics make no reference to the meaning of a 
language's constructs. For this purpose we have developed 
several semantic metrics that measure the interdependencies among 
the basic semantic ideas in a language. This technique has been 
applied to the control, data, and name structures of FORTRAN, 
BASIC, Lisp, Algol-60, and Pascal. 

Finally, we suggest that a useful measure of a programming
language is the complexity of the relationship between its
syntactic and semantic structures. For this purpose we introduce
a transformational metric and demonstrate its use on subsystems
of several languages.

The paper concludes by discussing the general principles 
underlying all of these metrics and by discussing the proper 
method of validating metrics such as these. 

1. Introduction

Since programming languages are the primary tools used in the 
programming process, it is not surprising that the choice of pro- 
gramming language is an important element of the life-cycle cost 
of a software development project. Sometimes the design of a new
programming language seems the appropriate approach, as has been
the case with the Ada language for embedded computer
applications. Whether an existing language is chosen or a new one
is designed, it is necessary to be able to compare languages and
judge their suitability for various applications.

Programming languages are frequently compared informally. 
One language may be described as more "structured" than another, 
or simpler, or more powerful, or better "human engineered", or 
less procedural, or smaller, or more "orthogonal", and so forth. 
These claims are particularly common in the descriptions of new 
programming languages. 

Unfortunately, there do not exist objective methods for
validating these claims. A claim that one language is preferable
to another may be supported by arguments, but these are
frequently unconvincing. Also, these arguments fail to provide
any quantitative measure of how languages compare along these
axes. This precludes any meaningful evaluation of the tradeoffs
among language design decisions. Thus, language comparison and
evaluation remain a mostly subjective art, not unlike literary
criticism (see, for example, [1]). This is unsatisfactory for a
tool of the importance of a programming language.

2. Related Work

The importance of language metrics makes the lack of research in 
this area quite astonishing. Perhaps this can be attributed to 
the relative youth of the craft of language design. Also, it may
in part result from some of the problems inherent in formulating
language metrics, a subject discussed later. In any case, there
are few reported attempts to place language comparison and 
evaluation on an objective basis. 

One such attempt was reported by Sammet [2] in 1971. This 
approach might be described as "quantified subjectivity." There 
are several steps: first, a list of language properties, such as 
"English-like" and "high-level", is made. Each property is 
assigned a weight depending on its relevance to an application 
(or application class) as judged by the evaluator. In the second 
stage the evaluator judges how well each language satisfies each 
property and assigns a corresponding numeric score. A final 
score for each language is computed by summing the weighted
individual scores. Sammet admits that this technique is
subjective but claims that it at least has the advantage of
making the evaluator's biases explicit.
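
The weighted-sum scheme just described can be sketched in a few
lines of Python. This is only an illustration of the arithmetic:
the property names come from the text above, but the weights, the
scores, and the two unnamed languages are hypothetical and are not
taken from Sammet's report.

```python
def weighted_score(weights, scores):
    """Sum of weight * score over every property named in `weights`."""
    return sum(weights[prop] * scores[prop] for prop in weights)


# Hypothetical weights: relevance of each property to one application.
weights = {"English-like": 1, "high-level": 3}

# Hypothetical evaluator judgements for two unnamed languages.
scores = {
    "language A": {"English-like": 4, "high-level": 2},
    "language B": {"English-like": 1, "high-level": 5},
}

totals = {lang: weighted_score(weights, s) for lang, s in scores.items()}
print(totals)  # {'language A': 10, 'language B': 16}
```

Both the weights and the scores are supplied by the evaluator, which
is why the result makes biases explicit without removing them.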

Other attempts to measure languages can be found in the
psychological experiments of Gannon [3] and others [5], which
compare specific language features (such as terminating versus
separating semicolons) with respect to properties such as
readability and susceptibility to error. Although these studies
are valuable, their application will be limited unless
psychological properties can be related to more general language
properties (e.g., degree of structure).

How might we go about measuring objective language properties?
What properties are amenable to such measurements? One
candidate is the size of a language. It is common to speak of 
one language (say, PL/I) being larger than another (say, Pascal) 
based on a subjective assessment of the number of features in 
each language. The size of the reference manuals may even be 
cited as evidence in such a judgement. A more promising approach 
to comparing language sizes is to compare the size of their gram- 
mars. Since a smaller, more regular language will tend to have a 
shorter grammar than a larger, less regular language, we can 
measure the size of a language by the size of its description in 
a grammar in an appropriate normal form. The grammar itself can 
be measured in a variety of ways (number of tokens, graph- 
theoretic measures, etc.). 

In the section Introduction we described the use of context-free
grammars to measure syntactic complexity. This is based on the
idea that the difficulty in learning a language is a function of
the length of its grammar. The reason for this is that the
programmer must, in essence, internalize these rules. Section 4
shows how this approach can be used to measure the total size of
a language's syntax, and how it can be used to compare the
relative sizes of a language's parts.

There must be more to complexity than just grammar size, 
however, since the shortest programming language grammar (for any 
infinite language) is that whose statements are sequences of 
identical tokens, e.g. 

<program> ::= 1 | <program> 1 

The reason that such a language is not simple is that the
translation mapping programs to their meanings is very complicated.
We could say that the translation is not continuous (this is more 
than a metaphor if these issues are placed in a lattice-theoretic 
framework). To measure this complexity we use translation gram- 
mars rather than simple generative grammars: the complexity of a 
language is a function of the relation between its syntax and its 
semantics. Measurement of this is accomplished by writing a 
translation grammar that maps the language in question into an 
abstract language that embodies its semantics. The size of this 
translation grammar can then be measured in a variety of ways. 
This transformational metric is demonstrated in Section 5.

The technique described above measures the transformational
complexity of a language, i.e., the complexity of the relation
between a language's syntax and semantics, but it does not 
address the complexity of the underlying semantics. That is, a 
language might have a simple grammar that is simply related to 
its semantic constituents, but these semantic constituents might 
themselves be complicated. (Of course, with a continuous trans- 
lation, a complicated semantics will to some extent induce a com- 
plicated syntax.) For instance, we can observe that the data 
structuring methods of Pascal are more elaborate than those of 
Algol-60. How can we measure this fact? 

One technique comes from denotational semantics (see, for
example, [6], [7]). By using these techniques one can formulate
a set of domain equations that describe, for instance, the data 
types provided by a language. It is then often possible to rank 
the complexity of the data structuring methods provided by 
several languages by comparing the complexity of the associated 
domain equations. To convert this into a quantitative measure it 
is necessary to measure the complexity of these equations quanti- 
tatively. This technique has already been used by the author to 
compare FORTRAN, Algol-60, Pascal, and Ada on the basis of the 
complexity of their data, control, and name structuring
facilities [8].

Some subsystems of a language, such as the control
structures, are not readily amenable to formulation as domain
equations. Thus, a more generally applicable technique has been
developed. We can observe that all structures in programming
languages are produced by applying a set of constructors to a set
of primitives. The various ways in which these constructors can
be combined can be represented as grammar-like rules or as simple
graphs. More formally, sets of structures can be taken as
objects, and constructors as morphisms, in a category
corresponding to the structural system.

How does this permit comparison or evaluation of languages? 
Intuitively, we might expect the complexity of a structural sys- 
tem to be related to the number of dependencies between parts of 
the system. These are represented by the number of morphisms, or 
by the number of edges in the diagram representing the system. 
Therefore, by ranking the complexity of the diagrams, we have an 
ordinal measure for system complexity, and by counting the edges 
in the diagram, we have a cardinal (quantitative) measure of
complexity. Of course there are many other measures that can be
applied to graphs, and several of these are investigated in
Section 6.



The important issue of the validation of programming
language metrics is discussed briefly in Section 7.

4. Syntactic Metrics

We define a context-free grammar G to be a quadruple,

    G = <T, N, P, g>

where T is a finite set of terminal symbols, N is a finite set of
non-terminal symbols, V = T ∪ N is the vocabulary, P ⊆ N × V* is a
finite set of productions, and g ∈ N is the goal symbol. We use
lower-case letters for elements of the vocabulary and upper-case
letters for sequences and sets.



For a string S in V*, let |S| be the length of S. Then, we
define the size |π| of a production π = <n,S> in P as
|n| + |S| = |S| + 1. The size |G| of a context-free grammar G is
defined

    |G| = \sum_{\pi \in P} |\pi| = p + \sum_{<n,S> \in P} |S|

where p = |P| is the cardinality of P. This definition of size
is essentially the same as S(G) defined in [9] and [10]. We also
define R(G) = |G| - p to be the total size of the right-hand sides
of the productions.



The size of a context-free grammar is easy to determine from 
its written form. For example, to determine the size of the 
grammar with these productions: 



g = h 

g = 
h = i 
h = sjg 
j = i 
j = Ji 

we simply count all the tokens except for the equal-signs. In
this case the size is 16.
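
The size definition translates directly into code. The following
is a minimal sketch, assuming a production is stored as a pair of a
left-hand symbol and a list of right-hand symbols; the function
names and the small grammar are hypothetical and are not the
grammar shown above.

```python
def production_size(lhs, rhs):
    """Size of one production <n, S>: |n| + |S| = 1 + |S|."""
    return 1 + len(rhs)


def grammar_size(productions):
    """|G|: the sum of the sizes of all productions."""
    return sum(production_size(lhs, rhs) for lhs, rhs in productions)


def right_hand_size(productions):
    """R(G) = |G| - p, where p is the number of productions."""
    return grammar_size(productions) - len(productions)


# Hypothetical grammar: e -> e - t | t,  t -> x | (empty)
productions = [
    ("e", ["e", "-", "t"]),
    ("e", ["t"]),
    ("t", ["x"]),
    ("t", []),          # empty right-hand side contributes size 1
]

print(grammar_size(productions))     # 9
print(right_hand_size(productions))  # 5
```

Counting one token for each left-hand side and one for each
right-hand symbol is exactly the "count everything except the
equal-signs" rule.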

Context-free grammars may be written in various kinds of
extended notations. For example, the BNF notation allows
productions of the form

    n = S_1 + S_2 + ... + S_k

as an abbreviation for the context-free productions

    n = S_1
    n = S_2
    ...
    n = S_k

We define the size of the BNF production in terms of the size of
the corresponding context-free productions, namely

    k + \sum_{i=1}^{k} |S_i|

Since there are k-1 plus-signs in the BNF production, the size of
BNF productions can also be determined by simply counting the
tokens they contain.
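
The claim that token counting gives the same number can be checked
with a one-line derivation. The grouping below is only an
illustration; the symbols are those of the BNF production with k
alternatives S_1, ..., S_k.

```latex
\begin{align*}
\text{tokens in the BNF production}
  &= \underbrace{1}_{\text{left-hand side } n}
   + \underbrace{\sum_{i=1}^{k} |S_i|}_{\text{alternatives}}
   + \underbrace{(k-1)}_{\text{plus-signs}} \\
  &= k + \sum_{i=1}^{k} |S_i| .
\end{align*}
```

For instance, a production with the three alternatives 'ab', 'c',
and 'def' has k = 3 and right-hand lengths 2, 1, and 3, so its size
is 3 + 6 = 9, which is also the number of tokens written down
(one left-hand symbol, six right-hand symbols, two plus-signs).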

Another common notation for context-free grammars allows the
use of parenthesized lists of alternatives. A production of the
form

    n = R (S_1 + ... + S_k) Q

means the same as

    n = R s Q
    s = S_1 + ... + S_k

where s is a new non-terminal. The size of the latter can be
computed from the extended production if each of the parentheses
is counted as one token. Similar conversions can be found for
other notations for context-free grammars.

Note that the number of productions in a BNF or extended BNF
grammar is just n, the number of non-terminals. We define the
right-hand size of a BNF or extended BNF grammar G to be
R(G) = |G| - n. Obviously, this is obtained by counting
everything to the right of the equal-signs.


In Table 1 we show the size of the context-free grammars for
BASIC, Pascal, Algol-60, and Ada. Since several of these
languages are expressed in extended-BNF notations, conversion
factors like those described above have been used.

The size measure we have defined can also be applied to
parts of a language's grammar. This is useful for comparing the
relative size of a language's subsystems and for comparing the
amount of syntax used by different languages for corresponding
subsystems. Table 2 shows the size of the major subsystems of
Algol-60. Table 3 compares Algol-60, Pascal, and Ada on the
basis of the proportion of their syntax devoted to various
purposes. The greater proportion of Pascal devoted to declarations
is a result of its more elaborate type system; this trend has
continued in Ada.
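
As a worked check, the Algol-60 column of Table 3 can be derived
from the subsystem sizes in Table 2. The sketch below assumes the
percentages were rounded to the nearest whole number, which is an
assumption about how the table was prepared; the variable names are
hypothetical.

```python
# Subsystem sizes for Algol-60, taken from Table 2.
algol60_sizes = {
    "Lexics": 69,
    "Expressions": 210,
    "Statements": 177,
    "Declarations": 147,
}

total = sum(algol60_sizes.values())

# Share of the grammar devoted to each subsystem, rounded to a whole percent.
percentages = {
    name: round(100 * size / total) for name, size in algol60_sizes.items()
}

print(total)                      # 603
print(percentages)                # {'Lexics': 11, 'Expressions': 35, 'Statements': 29, 'Declarations': 24}
print(sum(percentages.values()))  # 99
```

Under that assumption the rounded shares sum to 99 rather than 100,
which matches the total reported for Algol-60 in Table 3.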

5. Transformational Metrics

As discussed in Method of Approach, the goal of transformational
metrics is to measure the complexity of the relationship between 
the syntax and semantics of a language. We do this by measuring 
the size of a context-free translation grammar that maps the 
source constructs into an abstract language representing the 
meaning of the constructs. 

Translation grammars are commonly written as sets of
transformation rules. For example, the following production is a
transformation rule that maps certain expressions from infix to
prefix form:

    E = E+T  =>  +ET
      + E-T  =>  -ET
      + T    =>  T

(Of course, the left-most plus-sign in each line indicates
alternation in the BNF rule; the other plus-signs are terminal
symbols.)

The notation above is not general since there may be several
occurrences of the same non-terminal on the left. This results
in an ambiguity in the correspondence with the non-terminals on
the right. For this reason, a more general notation for
translation grammars uses natural numbers on the right to refer to
corresponding non-terminals on the left. For example:

    E = E+T  =>  +12
      + E-T  =>  -12
      + T    =>  1

Thus, in '+12', '1' refers to the first non-terminal on the left,
namely 'E'.
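
The role of the numbers can be illustrated with a small sketch. It
assumes that the non-terminals on the left have already been
translated, in left-to-right order, and shows only how the
right-hand side is filled in; the function name and the list
representation are hypothetical.

```python
def synthesize(template, translations):
    """Fill in a synthesis string.

    Integers in `template` select the translation of the corresponding
    non-terminal on the left (1-based); other items are copied as-is.
    """
    return "".join(
        translations[item - 1] if isinstance(item, int) else item
        for item in template
    )


# 'a + b' matched by E+T, with E translated to 'a' and T to 'b'.
print(synthesize(["+", 1, 2], ["a", "b"]))    # +ab

# '(a + b) - c' matched by E-T, with E already translated to '+ab'.
print(synthesize(["-", 1, 2], ["+ab", "c"]))  # -+abc
```

Because the numbers refer to positions rather than names, the same
scheme works when one non-terminal occurs several times on the left.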

These considerations lead to the following definition: A
context-free translation grammar is a quintuple,

    G = <T, S, N, P, g>

where T is a finite set of analysis terminal symbols, S is a
finite set of synthesis terminal symbols, N is a finite set of
non-terminal symbols, and P is a finite set of transformation
rules. A transformation rule is an element of

    N × V* × W*

where V = T ∪ N is the analysis vocabulary, and W = S ∪ Nat (where
Nat is the natural numbers) is the synthesis vocabulary.

A BNF translation rule such as

    E = E+T  =>  +12
      + E-T  =>  -12
      + T    =>  1

can be translated into the equivalent context-free translation
rules

    E = E+T  =>  +12
    E = E-T  =>  -12
    E = T    =>  1

which are represented by the triples

    <E, <E, +, T>, <+, 1, 2>>
    <E, <E, -, T>, <-, 1, 2>>
    <E, <T>, <1>>

We define the analysis size of a translation grammar G to be the
total size of the analysis parts of the rules:

    A(G) = \sum_{<n,S,T> \in P} |S|

Similarly, the synthesis size is the total size of the synthesis
parts of the rules:

    S(G) = \sum_{<n,S,T> \in P} |T|

Finally, the total size of the grammar is defined:

    |G| = |P| + A(G) + S(G)

Note that |P| + A(G) is the size of the context-free grammar
corresponding to the translation grammar G.

As with the syntactic metrics defined earlier, this
transformational metric can be computed by counting the tokens in
a translation grammar, ignoring the '=' and '=>' signs.
Consider the simple translation grammar in Figure 1, which maps
infix arithmetic expressions into prefix. Measuring it yields:

    A(G) = 19
    S(G) = 17
    |P|  = 9
    |G|  = 45
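
The same measurement can be sketched in code for the three
context-free translation rules for E given earlier in this section
(not the full grammar of Figure 1). The triple representation
follows the definition of a transformation rule; the function names
are hypothetical.

```python
# Each rule is a triple <n, S, T>: non-terminal, analysis part, synthesis part.
rules = [
    ("E", ["E", "+", "T"], ["+", 1, 2]),
    ("E", ["E", "-", "T"], ["-", 1, 2]),
    ("E", ["T"], [1]),
]


def analysis_size(rules):
    """A(G): total length of the analysis parts."""
    return sum(len(analysis) for _, analysis, _ in rules)


def synthesis_size(rules):
    """S(G): total length of the synthesis parts."""
    return sum(len(synthesis) for _, _, synthesis in rules)


def total_size(rules):
    """|G| = |P| + A(G) + S(G)."""
    return len(rules) + analysis_size(rules) + synthesis_size(rules)


print(analysis_size(rules))   # 7
print(synthesis_size(rules))  # 7
print(total_size(rules))      # 17
```

Here |P| + A(G) = 10 is the size of the underlying context-free
grammar, so the synthesis parts account for the remaining 7 tokens.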

The author used one variant of this approach to design the exter- 
nal appearance of the 8086 microprocessor for Intel Corporation. 
In this case a translation grammar was formulated that mapped an 
assembly-language level view of the machine into the various 
primitive operations it provided. The complexities of alternate 
views were then estimated by measuring the size of the associated 
translation grammars. The premise underlying this approach was 
that the syntactic complexity of a language was a function of the 
complexity of the mapping from the language into its semantic 
constituents. This mapping was, in essence, what the programmer 
had to learn in order to use the machine. This technique 
resulted in a number of improvements in the apparent simplicity 
of the 8086. 

6. Semantic Metrics

In this section we consider methods for measuring the semantic 
complexity of structural subsystems of a programming language. 
That is, we are interested in measuring the complexity of the 
semantic interrelationships without regard for the complexity of
the syntax with which they are expressed. 

Consider a subset of the Pascal type system with primitive
types real, integer, Boolean, and char and with the array and set
type constructors. The allowable interrelationships among these
types can be expressed by domain equations such as these:

    T = D + R + array(X,T) + set(X)
    D = X + I
    X = B + C + subrange(el(D), el(D))

where the plus-sign denotes disjoint union, upper-case letters
represent domains (T=type, D=discrete type, R=real, X=index type,
I=integer, B=Boolean, C=char), and words beginning with
lower-case letters denote functions on the domains. For example,
'set(S)' is the power-set of S and 'array(D,R)' is the set of all
(continuous) functions from D to R.



The number of restrictions and special cases inherent in a
subsystem of a programming language will be reflected in the
complexity of the domain equations required to describe that
subsystem. We can measure the complexity of these equations by
replacing them by an equivalent context-free grammar:

    T = D + r + aXT + sX
    D = X + i
    X = b + c + deDeD

This has the terminal symbols 'r', 'i', 'b', and 'c'
corresponding to the primitive types, and the terminal symbols
'a', 'd', 's', and 'e' corresponding to the type constructors.
We have eliminated parentheses by representing function
applications in prefix form (hence, we essentially have a tree
grammar). The resulting grammar generates the language of all
type structures defined by the equations, i.e.,

    { r, b, c, i, sb, sc, abr, abb, abc, abi, absb, ... }

We can measure the size of this grammar: 25.
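
The figure of 25 can be reproduced by counting tokens mechanically.
The sketch below assumes that every symbol of this grammar is a
single character, which holds for the three productions above but
not for grammars in general; the names are hypothetical.

```python
GRAMMAR = """
T = D + r + aXT + sX
D = X + i
X = b + c + deDeD
"""


def tokens(line):
    """Tokens of one BNF production; the equal-sign is not counted."""
    lhs, rhs = line.split("=")
    # Every non-blank character on the right is one token, plus-signs included.
    return [lhs.strip()] + [ch for ch in rhs if not ch.isspace()]


lines = GRAMMAR.strip().splitlines()

size = sum(len(tokens(line)) for line in lines)   # |G|
non_terminals = len(lines)                        # one BNF production each
right_hand_size = size - non_terminals            # R(G) = |G| - n

print(size)             # 25
print(non_terminals)    # 3
print(right_hand_size)  # 22
```

The three productions contribute 11, 4, and 10 tokens respectively.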

A semantic grammar is a BNF grammar in which the right-hand
sides of the productions are representations of domain
expressions. That is, the strings between the plus-signs are
either (1) non-terminals (representing non-primitive domains),
(2) niladic terminals (representing primitive domains), or (3)
n-adic terminals (representing constructor functions) followed by
n argument strings, each representing either a domain (primitive
or non-primitive), or a constructor function with its arguments.
Figure 2 shows a syntactic grammar for the Pascal type system;
Figure 3 shows the corresponding semantic grammar.

Another way to view a semantic subsystem of a language is
through a dependency graph like that in Figure 4, which
corresponds to the semantic grammar:

    T = D + r + aXT + sX
    D = X + i
    X = b + c + deDeD

In the graph the dependencies among the parts of the type system
become apparent: a type depends on the definition of another
type if there is an edge leading from the latter to the former.
Hence, recursive definitions are represented by cycles and
primitive domains are represented by initial nodes. The output
from a node can lead to exactly one other node, although this
latter node may be a fan-out operation (represented by a small
dot), which can have any number of outputs. The output of the
entire graph is always required to be a fan-out operation.

How can we measure the complexity of such a graph? The 
nodes represent the concepts (types, in this case) that are 
defined by the system and the edges represent the dependencies 
among the definitions. Therefore, since one notion of the
complexity of a system is just the number of dependencies among its
parts, one way to measure the complexity is to count the edges in 
the dependency graph. In this example it is 22. 

We now relate the complexity measures for semantic grammars 
and dependency graphs. 

Theorem: Let G be a semantic grammar and let Γ be the
corresponding dependency graph. Let E(Γ) represent the number of
edges in Γ, and F(Γ) represent the number of fan-out nodes in Γ.
Then:

    R(G) = E(Γ)
    N(G) = F(Γ)
    |G| = E(Γ) + F(Γ)

where N(G) is the size of the non-terminal vocabulary of G
(which is also the number of productions in a BNF grammar).

Proof: We sketch the proof informally. The method of
constructing the dependency graph from a grammar will make the
truth of the theorem obvious. Repeat the following procedure for
each production in the grammar:

For each production 'n = S', add a fan-out node labeled 'n'
to the graph. Hence, the number of fan-out nodes will equal the
number of non-terminals, since in a BNF grammar the number of
productions is the same as the number of non-terminals. Thus,
N(G) = F(Γ).

Suppose that S (in the production 'n=S') has the form U+V; 
add to the graph a plus-node whose inputs are U and V and whose 
output is the fan-out node for n. The plus-sign in the produc- 
tion corresponds to the edge from the plus-node to the fan-out 
node. Continue this process if either U or V contains plus-signs
by adding new plus-nodes whose outputs lead to previously added
plus-nodes. Hence, the number of edges leading from plus-nodes
is the number of plus-signs in the grammar.

Next consider a terminal string S that does not contain a 
plus-sign. If S is a single niladic terminal symbol t, then add 
an initial node to the graph with an edge leading out from it. 
Hence, the number of edges leading from initial nodes is the 
number of occurrences of niladic terminal symbols. 

If S is a single non-terminal symbol n, then construct an
edge leading from the fan-out node labeled n. Hence, the number
of edges leading from fan-out nodes is the number of occurrences
of non-terminal symbols on the right-hand side of rules.

Finally, suppose S is a string

    f S_1 S_2 ... S_n

where f is a non-niladic terminal symbol representing an operator
and the S_i are strings representing the arguments of that
operation. Add a node representing an operation f and recursively
process its arguments. Hence, the number of edges leaving
operator nodes is the number of non-niladic terminal symbols in
the grammar.

Since every edge must leave either a fan-out node, an initial
node, a plus-node, or an operator node, the total number of edges
is the total of the number of occurrences of non-terminals,
niladic terminals, plus-signs, and non-niladic terminals. Hence,
the number of edges is just the total number of symbols that
occur on the right of the BNF rules, so R(G) = E(Γ). QED.

Both the grammar-oriented and graph-oriented approaches have
been applied to measuring the semantic complexity of the data, 
control, and name structures of several programming languages. 
These studies are reported in [8] and [11].

7. Validation of Metrics

There remains the important question, How are these measures 
validated? To put it another way, we have an informal
understanding of complexity; how can we make it formal? Firstly, our
formal measure must agree with our informal judgements in most 
cases. For instance, the measure should show that the data 
structures of Algol-60 are simpler than those of Pascal. This 
aspect of the validation could be backed up with formal psycho- 
logical tests, but this does not seem necessary. Psychological 
validation has not been required for concepts such as "comput- 
able": the formal definition seems to correspond to the infor- 
mal, although no formal proof of the correspondence is possible. 

Secondly, we can determine if the formal measure satisfies 
the same properties as the informal. For instance, the measure 
should be additive in those aspects that the informal idea is 
additive. An example of this comes from information theory; we 
expect the information capacity of two pages to be approximately 
the sum of the information capacities of the separate pages. It 
is easy to see that the formal definition of information capacity 
satisfies this property. 

Finally, the formal measure should be productive; that is,
it should lead to a rich theory with good predictive abilities
and explanatory power. Information theory is a perfect example.
Of course, it is difficult to evaluate a measure on this basis
until a substantial amount of experience in its use has
accumulated.

8. Conclusions

In this paper we have defined three simple metrics that can be
applied to programming language design. The first is a syntactic 
metric that is determined by counting the tokens in a context- 
free grammar for a language or a part of a language. This allows 
a language designer to estimate the total syntactic complexity of 
a language and to measure the relative proportion of a language's 
syntax devoted to different purposes. 

The second metric is a transformational metric that is 
determined by the number of tokens in a translation grammar that 
maps the source language into an abstract language reflecting the 
basic semantic notions of the language. This metric allows the 
language designer to evaluate the complexity of the relationship 
between a language's syntax and semantics. Like the syntactic 
metric, it can be applied to the entire language or to particular
parts.

Next we defined a semantic metric that is determined by the
number of tokens in a context-free grammar that describes the
dependencies among the semantic primitives. This metric was
shown to be equivalent to a metric based on the number of nodes
and edges in the corresponding semantic dependency graph. The
semantic metric is most usefully applied to well-defined semantic
subsystems of a programming language, such as its control
structure, name structure, and data type systems. This permits
the comparison of the complexity of the dependencies in
corresponding systems in different languages.

Finally we discussed the validation of metrics like those 
defined in this paper. We argued that these metrics must be 
validated by their integration with existing theories and by 
their usefulness, rather than by psychological demonstrations of 
their relationship with perceived qualities. As it has in the 
natural sciences, the objective approach is more likely to pro- 
duce testable, widely applicable theories than is the subjective 
approach. 

TABLE 1. Comparison of Sizes of Entire Languages

    Language    Total Grammar Size
    BASIC                      396
    Pascal                     541
    Algol-60                   603
    Ada                       1614



TABLE 2. Sizes of Subsystems of Algol-60

    Subsystem       Size (tokens)
    Lexics                     69
    Expressions               210
    Statements                177
    Declarations              147

    Total                     603

TABLE 3. Subsystem Proportions of Algol-60, Pascal, and Ada

    Subsystem       Algol-60 (%)   Pascal (%)   Ada (%)
    Lexics                    11           14         8
    Expressions               35           23        16
    Statements                29           22        22
    Declarations              24           41        54

    Total                     99          100       100

[Figure 1: grammar body not recoverable]

Figure 1. Translation Grammar for Arithmetic Expressions

    type            = type_id
                    + id_list
                    + constant .. constant
                    + ↑ type_id
                    + PACKED structured_type
                    + structured_type
    id_list         = id + id , id_list
    structured_type = ARRAY [ type_list ] OF type
                    + RECORD field_list END
                    + RECORD field_list variant_part END
                    + FILE OF type
                    + SET OF type
    field_list      = ε + id_list : type ; field_list
    variant_part    = CASE opt_id type_id OF variant_list
    opt_id          = id : + ε
    variant_list    = variant + variant ; variant_list
    variant         = case_labels : ( field_list )
    case_labels     = constant + constant , case_labels

Figure 2. Syntactic Grammar of Pascal Type System

    type            = REAL
                    + discrete_type
                    + PTR type
                    + PACKED structured_type
                    + structured_type
    discrete_type   = INTEGER + index_type
    index_type      = BOOLEAN + CHAR + POWERSET id
                    + SUBRNG const const
    const           = SELECT discrete_type
    structured_type = ARRAY index_type type
                    + SET index_type
                    + FILE type
                    + RECORD field_list
                    + RECORD field_list variant_part
    field_list      = ε + CONS PAIR id type field_list
    variant_part    = CASE opt_id index_type variant_list
    opt_id          = id + ε
    variant_list    = variant + CONS variant variant_list
    variant         = PAIR constant field_list

Figure 3. Semantic Grammar for Pascal Type System

Figure 4. Diagram of Subset of Pascal Type System